Deep Learning-Based Joint Effusion Classification in Adult Knee Radiographs: A Multi-Center Prospective Study

Knee effusion, a common and important indicator of joint diseases such as osteoarthritis, is typically more discernible on magnetic resonance imaging (MRI) scans than on radiographs. However, the use of radiographs for the early detection of knee effusion remains promising due to their cost-effectiveness and accessibility. This multi-center prospective study collected a total of 1413 radiographs from four hospitals between February 2022 and March 2023, of which 1281 were analyzed after exclusions. To automatically detect knee effusion on radiographs, we utilized a state-of-the-art (SOTA) deep learning-based classification model with a novel preprocessing technique to optimize images for diagnosing knee effusion. The diagnostic performance of the proposed method was significantly higher than that of the baseline model, achieving an area under the receiver operating characteristic curve (AUC) of 0.892, accuracy of 0.803, sensitivity of 0.820, and specificity of 0.785. Moreover, the proposed method significantly outperformed two non-orthopedic physicians. Coupled with an explainable artificial intelligence method for visualization, this approach improved not only diagnostic performance but also interpretability, highlighting areas of effusion. These results demonstrate that the proposed method enables the early and accurate classification of knee effusions on radiographs, thereby reducing healthcare costs and improving patient outcomes through timely interventions.


Introduction
Knee effusion is a primary symptom of knee joint diseases, particularly common among patients with degenerative arthritis such as osteoarthritis [1][2][3]. Without timely detection and appropriate treatment, effusion can lead to significant consequences, causing continuous joint deterioration and impacting patients' quality of life [4][5][6].
According to orthopedic diagnostic guidelines, identifying effusion in X-ray images involves recognizing a well-defined, rounded, homogeneous soft tissue density in the suprapatellar recess on lateral X-rays [7][8][9]. However, effusion is often challenging to identify and is easily overlooked. While magnetic resonance imaging (MRI) offers better clarity for diagnosing knee effusion, assessing effusion in X-ray images is crucial for optimizing time and cost efficiency [10,11]. Therefore, radiographic imaging plays a pivotal role in diagnosing knee effusion [12][13][14][15].
Recent advancements in radiology have shown significant research growth, particularly in the application of artificial intelligence (AI) and deep learning for radiological evaluations and automation [16][17][18][19][20]. Notably, these advancements in X-ray imaging have shown promising results for early disease detection [21,22]. Despite the demonstrated efficacy of deep learning across various radiological applications, to our knowledge, no AI research exists for diagnosing knee effusion in X-ray images. Current studies have mainly focused on knee joint recognition and the severity assessment of knee osteoarthritis [23][24][25]. Additionally, attempts to visualize effusion areas in joints have been limited to the elbow region [26], leaving a notable gap in similar applications for knee effusion detection.
Therefore, this study proposes an AI-based diagnostic methodology that enhances orthopedic diagnoses by classifying and visualizing knee joint effusion on X-ray imaging. Our approach involves performing image-level classification of knee effusion using novel preprocessing techniques, focusing on identifying predominant effusion sites. Additionally, we visualize the effusion areas through weakly supervised localization.

Patient Population
This multi-center prospective study was approved by the institutional review board, and the requirement for written consent was waived for all subjects. We acquired X-ray images from 1413 cases from four hospitals, which were prospectively collected between February 2022 and March 2023. We excluded 132 cases based on the following criteria: (a) incomplete visibility of effusion areas, (b) overlapping left and right knees in a single radiograph, (c) images that are blurred or excessively dark or bright, and (d) presence of orthopedic hardware such as K-wires (KW) around the patella. The remaining 1281 cases were randomly divided into a training set (80%) and a test set (20%). The data flow diagram is illustrated in Figure 1.
As shown in Figure 1, 300 randomly selected effusion cases in the training set were annotated with bounding boxes (bbox) around the patella by a medical AI researcher to train a patella detection model. This detection dataset was then divided into a training set of 200 cases (67%) and a test set of 100 cases (33%). Additionally, three orthopedic physicians, each with more than 10 years of experience, annotated all cases for the presence of effusion. Effusion was defined as a well-defined, rounded, homogeneous soft tissue density within the suprapatellar recess on a lateral radiograph. Consequently, the training set included 496 (48%) normal cases and 530 (52%) effusion cases, while the test set included 121 (47%) normal cases and 134 (53%) effusion cases. Sample X-ray images of normal and effusion cases are shown in Figure 2.

X-ray Acquisition Parameters
The X-ray images were taken in the lateral decubitus position, and detailed information for each hospital is provided in Table 1. Due to privacy concerns, the images were collected in the Joint Photographic Experts Group (JPEG) format, limiting the availability of further details.

Methodology
We proposed a method that classifies the presence of knee effusion and enables the visualization of the effusion area. Our proposed architecture is depicted in Figure 3.

Knee Structure-Aware Image Preprocessing
To address variations in fields of view (FoV) and intensity levels caused by different acquisition protocols across institutions, we developed a robust preprocessing strategy. First, we addressed image intensity variations by removing background elements outside the body using a region-growing algorithm. Second, we constructed a deep learning-based patella detection model using the YOLO v8 [27] architecture to crop the effusion area. To standardize each predicted bounding box (bbox) of the patella, we first aligned the center of the bbox of all data to the average center position of the patella. Then, we rescaled the image based on the smallest bbox in the training set. After scaling, we added zero-padding to ensure that the image was centered. Subsequently, the image was cropped to a size of 1600 × 1600 pixels to preserve the area information of the effusion without distortion. This process ensures a standardized input image that includes the effusion area with a uniform size. The results of our proposed preprocessing method are shown in Supplement S2.
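The geometric steps described above (translation, rescaling, zero-padding, and cropping) can be sketched as a single inverse mapping. This is a minimal pure-Python illustration under stated assumptions, not the implementation used in this study: the function name `standardize_view`, the nearest-neighbour resampling, the use of the bbox width as the size measure, and the `target_center` parameter (a stand-in for the average patella center) are all hypothetical. A practical pipeline would use an array library and an output size of 1600.

```python
import math

def standardize_view(img, bbox, ref_width, target_center, out_size):
    """Translate, rescale, zero-pad, and crop a grayscale image in one pass.

    img           -- 2D list of pixel values (rows of equal length)
    bbox          -- detected patella box (x0, y0, x1, y1) in pixels
    ref_width     -- reference bbox width (assumed: smallest training bbox)
    target_center -- (x, y) in the output where the patella center should land
                     (assumed stand-in for the average patella center)
    out_size      -- side length of the square output (1600 in the paper)
    """
    h, w = len(img), len(img[0])
    x0, y0, x1, y1 = bbox
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    scale = ref_width / float(x1 - x0)  # >1 enlarges, <1 shrinks
    tx, ty = target_center

    # Pixels that map outside the source image stay 0 (zero-padding).
    out = [[0] * out_size for _ in range(out_size)]
    for r in range(out_size):
        for c in range(out_size):
            # Inverse map: which source pixel lands on output pixel (r, c)?
            sx = math.floor(cx + (c - tx) / scale)
            sy = math.floor(cy + (r - ty) / scale)
            if 0 <= sx < w and 0 <= sy < h:
                out[r][c] = img[sy][sx]
    return out
```

Because every output pixel is computed from the detected patella box, two radiographs with different fields of view end up with the patella at the same position and the same apparent size, which is the property the preprocessing is designed to provide.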

DL Architecture
We conducted a comparative analysis of five different network models pre-trained on ImageNet [28]: VGG19 [29], ResNet50 [30], DenseNet121 [31], EfficientNet [32], and Vision Transformer (ViT) [33]. The input consisted of preprocessed images derived from the original X-ray images, and the output was a continuous value between 0 and 1 representing the probability of effusion presence. The training set (n = 1026) was divided into a development set (n = 771, 75%) and a validation set (n = 255, 25%). For the qualitative analysis of the classification model, we compared various class activation map (CAM) methodologies and empirically selected Eigen-CAM [34] for its superior qualitative performance.
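The development/validation partition can be reproduced in spirit with a seeded random split. The sketch below assumes a simple (non-stratified) shuffle and a hypothetical helper `split_cases`; the text does not state the seed or whether the split was stratified by label.

```python
import random

def split_cases(case_ids, n_val, seed=0):
    """Randomly split case IDs into (development, validation) lists."""
    ids = list(case_ids)
    if not 0 <= n_val <= len(ids):
        raise ValueError("n_val must be between 0 and the number of cases")
    # A local RNG keeps the split reproducible without touching global state.
    random.Random(seed).shuffle(ids)
    return ids[n_val:], ids[:n_val]

# Sizes taken from the text: 1026 training cases, 255 held out for validation.
dev, val = split_cases(range(1026), n_val=255)
print(len(dev), len(val))  # 771 255
```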

Model Specifications
This study utilized the PyTorch 2.0 framework to train a binary classification model with CrossEntropyLoss. The model was trained for 150 epochs with a batch size of 14, a learning rate of 0.001, and the SGD optimizer. Training was performed on an NVIDIA RTX A5000 24 GB GPU with CUDA version 11.8 and an AMD EPYC 7452 32-core processor, sourced from COMPUWORKS Co., Seoul, Republic of Korea, and it took approximately 40 min.
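For reference, the objective and update rule named above can be written out explicitly. The expressions below give the standard two-class cross-entropy over a mini-batch and a plain SGD step; the symbols (logits $z_{i,k}$, labels $y_i \in \{0, 1\}$, parameters $\theta$) and the absence of momentum or weight decay are assumptions, since these details are not stated in the text.

```latex
\begin{aligned}
\mathcal{L}(\theta) &= -\frac{1}{B}\sum_{i=1}^{B}
  \log \frac{\exp\left(z_{i,y_i}\right)}{\exp\left(z_{i,0}\right) + \exp\left(z_{i,1}\right)},
  \qquad B = 14, \\
\theta &\leftarrow \theta - \eta \, \nabla_{\theta}\mathcal{L}(\theta),
  \qquad \eta = 0.001 .
\end{aligned}
```

Under this formulation, the softmax value for class 1 is the continuous output between 0 and 1 that is interpreted as the probability of effusion.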

Statistical Analysis
For the statistical analysis, the following software was used: R Core Team, 2024 (R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna. https://www.R-project.org, accessed on 23 April 2024). DeLong's test [35] and McNemar's test [36] were used to compare the performances of the two models. A p value of less than 0.05 was considered statistically significant.
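To illustrate how McNemar's test compares two classifiers evaluated on the same cases, the following sketch computes a two-sided exact p-value from the discordant pairs. It is an illustration only: the analysis in this study was performed in R, the text does not state which variant of the test was used (exact or chi-squared), and the function name `mcnemar_exact` is hypothetical.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts.

    b -- cases classified correctly by method A only
    c -- cases classified correctly by method B only
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis the discordant pairs follow Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2.0 ** n
    return min(1.0, 2.0 * tail)
```

Only the cases on which the two methods disagree enter the statistic; cases that both methods classify correctly, or both incorrectly, carry no information about which method is better.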

Performance of the Classification Models
We compared the classification performances of five different deep learning models using images without preprocessing. DenseNet121 achieved the highest area under the receiver operating characteristic (ROC) curve (AUC) on the validation set, and the results are presented in Supplement S3. Therefore, we selected DenseNet121 as a baseline classification model to analyze the impacts of our proposed method.
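The metrics used for model comparison can be computed directly from labels and predicted probabilities. The sketch below is a minimal illustration, assuming labels coded as 0 (normal) and 1 (effusion) and a decision threshold of 0.5, which is not stated in the text; the function names and the toy data are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

def binary_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Toy data, not study results.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```

The AUC is threshold-independent, which is why it is suited to selecting among architectures, whereas accuracy, sensitivity, and specificity depend on the chosen operating point.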

Qualitative Results of the Classification Models
By applying a trained classification model that uses a binary label to indicate the presence of effusion, we generated Eigen-CAM images that highlight the effusion areas. These Eigen-CAM images emphasize regions related to effusion, typically located in the upper region of the knee joint. Figure 6 demonstrates the qualitative results, comparing Eigen-CAM with and without the knee structure-aware preprocessing, showing where the model identifies and emphasizes key areas using a heatmap. Additional results are illustrated in Supplement S4, and the results of comparing different CAM methodologies are shown in Supplement S5.
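The idea behind Eigen-CAM, projecting a layer's activations onto their first principal direction, can be sketched without any deep learning framework. The code below is a simplified illustration under stated assumptions: activations are given as a nested list of shape [C][H][W] with non-negative values (as after a ReLU), the principal direction is found by power iteration rather than a full singular value decomposition, and no mean-centering is applied. The function name `eigen_cam` is hypothetical, and in practice the map would be upsampled to the input resolution and overlaid on the radiograph.

```python
import math

def eigen_cam(activations, n_iter=100):
    """Project activations [C][H][W] onto their first principal direction."""
    C = len(activations)
    H = len(activations[0])
    W = len(activations[0][0])
    # Rows are spatial positions, columns are channels: M has shape (H*W, C).
    M = [[activations[c][i][j] for c in range(C)]
         for i in range(H) for j in range(W)]
    # Gram matrix G = M^T M; its top eigenvector is the first right
    # singular vector of M.
    G = [[sum(row[a] * row[b] for row in M) for b in range(C)]
         for a in range(C)]
    v = [1.0 / math.sqrt(C)] * C
    for _ in range(n_iter):  # power iteration
        w = [sum(G[a][b] * v[b] for b in range(C)) for a in range(C)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # all-zero activations: nothing to project
        v = [x / norm for x in w]
    proj = [sum(row[c] * v[c] for c in range(C)) for row in M]
    if sum(proj) < 0:  # the sign of an eigenvector is arbitrary
        proj = [-p for p in proj]
    proj = [max(p, 0.0) for p in proj]
    peak = max(proj)
    if peak > 0:
        proj = [p / peak for p in proj]  # normalize to [0, 1]
    return [proj[i * W:(i + 1) * W] for i in range(H)]
```

Because the projection uses only the activations and no class-specific gradients, the resulting map reflects the dominant spatial pattern in the chosen layer, which is consistent with its use here as a weakly supervised localization of the effusion area.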

Discussion
In this study, we proposed a novel method for classifying the absence or presence of knee effusion in radiographs. By applying the proposed method, the model's performance significantly improved, with an AUC of 0.892 compared to 0.821 for the model without our method. Additionally, our method significantly outperformed two non-orthopedic physicians in terms of accuracy, sensitivity, and specificity, achieving scores of 0.803, 0.820, and 0.785, respectively. These results demonstrate the potential of AI to facilitate the early and accurate classification of knee effusions.
Our findings reveal that while DenseNet121 has already shown robust performance in various clinical studies [37,38], our novel approach enhances the model's ability to discern the presence or absence of knee effusion in radiographs. The proposed preprocessing method optimizes the input data to enhance image features important for identifying disease-specific conditions, focusing on knee regions related to effusion, such as the patella. In X-ray images, effusion can be very subtle and difficult to detect compared to MRI scans [10], even in patients with the disease. Therefore, it was necessary to utilize anatomically clear body structures for more robust standardization of the images and FoV. Accordingly, we devised a preprocessing method that detects the patella in lateral knee X-rays to precisely locate regions where effusion is likely to occur. This enables the model to perform more precise and accurate feature extraction and classification. Both qualitative and quantitative analysis showed that the preprocessing allowed for more nuanced interpretations of subtle clinical signs of effusion.
Currently, X-ray imaging is a major tool for the initial diagnosis of diseases due to its relatively low cost, minimal radiation exposure, and faster acquisition time [39]. Therefore, diagnosis of knee joint disorders is widely based on X-ray images [12]. One critical condition of knee joint disorders is effusion, which occurs outside the bones of the knee and can indicate other abnormalities within the joint [40]. However, visually identifying effusion in X-ray images is challenging, especially in the early stages, making it particularly difficult for non-orthopedic physicians [41]. In our physician evaluation, our model achieved higher diagnostic accuracy than the non-orthopedic physicians. This may be due to two main reasons: First, the physicians involved in the study lacked specialized knowledge and experience in X-ray image interpretation, as they were not orthopedic surgeons familiar with arthroscopic surgery. Second, the dataset used in this study consisted mostly of early-stage effusions, which can be more subtle and ambiguous to diagnose. Nevertheless, the AI model provided more accurate diagnoses because it has a superior ability to selectively focus on, interpret, and classify the unique patterns presented by effusions. Therefore, AI models can be used as a supportive computer-aided diagnosis system in other departments where diagnosing knee effusion is challenging for non-orthopedic surgeons.
Moreover, we were able to visualize the areas on which the AI model concentrated during effusion prediction by employing Eigen-CAM. The areas highlighted by Eigen-CAM accurately indicate the regions where effusion is present. This indicates that the model recognizes the visual patterns associated with the features of effusion locations. Nonetheless, effusion can be challenging to accurately capture using Eigen-CAM due to its blurred appearance compared to surrounding tissues and the unclear structure of the quadriceps tendon. However, Eigen-CAM emphasized the posterior quadriceps tendon and anterior patella, indicating that the model could consider thickened or indented areas of the femur or the synovial membrane as significant indicators. It might also consider the condition of the suprapatellar fat pad compressed by effusion fluid as a key factor in predicting the presence or absence of effusion. This visual interpretation offers insights into the model's decision-making process based on specific anatomical structures and features, helping clinicians trust and effectively use AI predictions. Additionally, determining the presence of effusion heavily depends on the clinician's experience. Therefore, Eigen-CAM can play an educational role by using visualization to help less experienced clinicians better understand the clinical signs of effusion.
The proposed methodology demonstrates promising clinical applicability in detecting knee effusion. This condition is closely associated with musculoskeletal pathologies, making the diagnosis of effusion crucial. The model can be effectively utilized to diagnose conditions related to knee joint effusion, such as OA and anterior cruciate ligament (ACL) tears. Furthermore, the proposed preprocessing methodology could be applied to other knee pathologies, including meniscal tears, tibial plateau fractures, ligament injuries, and patellar disorders. Moreover, the model's ability to focus on specific anatomical regions suggests its potential for diagnosing effusions in other joints, such as the talus in the ankle and the epicondyle in the elbow. This indicates that the model could be expanded into a useful tool for diagnosing a variety of joint-related diseases.
Additionally, the inference time of the proposed method is a vital component of this study. The proposed method performed predictions on 255 images in just 4.332 s (approximately 0.017 s per image), indicating its capability to provide rapid and accurate diagnoses. This rapid inference time significantly improves the clinical applicability of the model, especially in medical environments where timely diagnosis is essential for patient care and treatment. The AI model can provide highly accurate diagnoses in just a few seconds, greatly supporting medical professionals and streamlining the diagnostic process. This rapid assessment enables AI to screen patients who specifically need immediate attention from a physician.
Our study has several limitations: First, the comparison experiment between non-orthopedic physicians and AI involved a limited number of participants, making it difficult to determine whether our results are representative of all non-orthopedic physicians. Additionally, we did not conduct comparison experiments with orthopedic physicians who are experts in diagnosing effusion. Furthermore, a reader study will be necessary to assess the clinical utility of the developed computer-aided diagnosis system [42,43]. Second, despite utilizing data from multiple centers, we aggregated all the data and randomly partitioned it into training and test sets. Therefore, we did not perform external validation. To evaluate the generalization performance of our model, we plan to establish an external validation dataset. Our model must perform well across diverse clinical settings, including handling knee images with features such as surgical scars or the poor-quality images that were excluded from this study. Therefore, we aim to enhance the model's effectiveness by testing its performance across various clinical conditions and anatomical regions. Third, while Eigen-CAM provides a rough indication of the location of effusions, it does not reveal the specific interpretable features considered by the model in making a diagnosis. Therefore, our future work aims to develop a model that uses a large language model (LLM) guide to explain, in text, the reasons for diagnosing effusion or normal conditions [44,45].

Conclusions
This study demonstrated the capabilities of the proposed deep learning model in diagnosing knee effusion, with significantly better performance than both the state-of-the-art deep-learning-based model and non-orthopedic physicians. The developed computer-aided diagnosis system based on the proposed method would greatly help in accurately and rapidly screening patients with effusion, aided by the interpretable visualization map.
Exclusion criteria: (a) incomplete visibility of effusion areas, (b) overlapping left and right knees in a single radiograph, (c) images that are blurred or excessively dark or bright, and (d) presence of orthopedic hardware such as K-wires (KW) around the patella. The remaining 1281 cases were randomly divided into a training set (80%) and a test set (20%). The data flow diagram is illustrated in Figure 1.

Figure 1. Flowchart for study inclusion and exclusion.

Figure 2. Sample knee X-ray images: (a) normal case; (b) effusion case (the red bounding box indicates the area of effusion).

Figure 3. Proposed architecture for knee joint effusion classification and visualization.

Figure 4. (a) ROC curves; (b) confusion matrix of DenseNet121; (c) confusion matrix of the proposed method.

Figure 5. Comparison of physician evaluations on the ROC curve (non-orthopedic physician 1: physical medicine and rehabilitation; non-orthopedic physician 2: occupational and environmental medicine).

Figure 6. Visualization results using Eigen-CAM: (a) true positive cases; (b) true negative cases. The highly important features considered by the model for prediction are highlighted in red.

Comparison of the visibility of effusion in MRI and X-ray images of the knee from the same patient: (a) knee effusion captured by radiograph, (b) the same effusion captured by MRI; Figure S2: Comparison of image cropping techniques and the proposed method: cases 2 and 4 are images where the effusion area and the knee shape have been cropped out; Figure S3: Results of classical segmentation algorithms; Figure S4: Procedure of knee structure-aware image preprocessing: (a) original image, (b) region growing, (c) translation, (d) padding: resize and centering for patella size matching, (e) cropping; Figure S5: Results for each scenario (the red box represents the original image, and the blue box represents the translated image); Figure S6: Visualization results using Eigen-CAM: (a) false positive cases, (b) false negative cases; Figure S7: Comparison results of visualizations among different CAM methods; Table

Table 1. The demographic information and acquisition parameters of multi-center images.

Table 2. Comparison of each method's performance for classification; the highest values are boldfaced.
† p-values were calculated by DeLong's test for AUC and McNemar's test for the other metrics. CI, confidence interval.

Table 3. Comparison chart between the proposed method and physician evaluations. The highest values are boldfaced (non-orthopedic physician 1: physical medicine and rehabilitation; non-orthopedic physician 2: occupational and environmental medicine).
† p-values were calculated by McNemar's test. CI, confidence interval.